Adaptive model-based speech enhancement

Authors

  • Beth Logan
  • Tony Robinson
Abstract

Declaration

This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where stated. It has not been submitted in whole or part for a degree at any other university. The length of this thesis including footnotes and appendices is approximately 37000 words.

Summary

This dissertation details the development and evaluation of techniques to enhance speech corrupted by unknown independent additive noise when only a single microphone is available. It therefore seeks to address a deficiency of many speech enhancement systems which require a priori knowledge of the interfering noise statistics. Such a deficiency must be corrected if these systems are to operate in real world situations.

The enhancement systems developed are based on an existing system by Ephraim (Ephraim 1992a). This approach models the speech and noise statistics using autoregressive hidden Markov models (AR-HMMs). Two main extensions to this technique are developed in order to make it adaptive. The first estimates the noise statistics from detected pauses. The second forms maximum likelihood estimates of the unknown noise parameters using the whole utterance. Both techniques operate within the AR-HMM framework.

Additional work in this dissertation improves the modelling power of AR-HMM systems by incorporating perceptual frequency. The bilinear transform is used to warp the frequency spectrum of the feature vectors to an approximation of the Bark scale. This modification can be incorporated into both AR-HMM recognition and enhancement systems.

The enhancement techniques are evaluated on the NOISEX-92 and Resource Management (RM) databases, giving indications of performance on simple and more complex tasks respectively. Additional experiments investigating the incorporation of perceptual frequency into AR-HMM systems were conducted on the E-set of the speaker independent ISOLET database.
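
The frequency warping mentioned in the summary can be illustrated with the standard first-order all-pass (bilinear) mapping, which sends a normalized angular frequency ω to ω + 2·arctan(α·sin ω / (1 − α·cos ω)). The sketch below is only an illustration of that general mapping, not the dissertation's implementation: the function name `warp_frequency` and the value `alpha = 0.5` are assumptions chosen for the example, and the coefficient actually used to approximate the Bark scale is not given in this abstract.

```python
import math


def warp_frequency(omega: float, alpha: float) -> float:
    """Warp a normalized angular frequency (radians, 0..pi) with a
    first-order all-pass (bilinear) mapping.

    alpha is the warping coefficient, with |alpha| < 1. A positive alpha
    stretches the low-frequency region, which is the qualitative
    behaviour of perceptual scales such as Bark.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    alpha = 0.5  # illustrative value only, not taken from the dissertation
    for k in range(5):
        omega = k * math.pi / 4.0
        print(f"omega = {omega:.3f} rad -> warped = {warp_frequency(omega, alpha):.3f} rad")
```

The mapping leaves the endpoints 0 and π fixed and is monotonic for |α| < 1, so it only redistributes resolution across the band; setting α = 0 recovers the unwarped frequency axis.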

Both enhancement schemes proposed were able to improve substantially on baseline results. The technique of forming maximum likelihood estimates of the noise parameters was found to be the more effective of the two. Its performance was evaluated over a wide range of noise conditions ranging from -6 dB to 18 dB and on various types of stationary real-world noises.

The incorporation of perceptual frequency into AR-HMM systems was found to increase recognition performance substantially on both the ISOLET and RM databases. The improvement was less marked for the more complex task, highlighting that AR-HMMs could benefit from the inclusion of more variance information.

Acknowledgements

First I would like to thank my supervisor Tony Robinson. He provided me with the …

Similar resources

Speech Enhancement by Modified Convex Combination of Fractional Adaptive Filtering

This paper presents new adaptive filtering techniques for use in speech enhancement systems. Adaptive filtering schemes are subject to different trade-offs regarding their steady-state misadjustment, speed of convergence, and tracking performance. Fractional Least-Mean-Square (FLMS) is a new adaptive algorithm which has better performance than the conventional LMS algorithm. Normalization of LMS ...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal is significantly reduced in the presence of environmental noise, which degrades the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise respectively) in the HMM ...

Utilizing Kernel Adaptive Filters for Speech Enhancement within the ALE Framework

The performance of the linear models widely used within the framework of adaptive line enhancement (ALE) deteriorates dramatically in the presence of non-Gaussian noise. On the other hand, adaptive implementation of nonlinear models, e.g. Volterra filters, suffers from the severe problems of a large number of parameters and slow convergence. Nonetheless, kernel methods are emerging solutions t...

Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty

In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the DFT coefficients of the noise are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...


Journal:
  • Speech Communication

Volume 34

Publication date: 2001